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We present an efficient and accurate algorithm for principal component analysis (PCA) of a large set of two- 
dimensional images, and, for each image, the set of its uniform rotations in the plane and its reflection. The 
algorithm starts by expanding each image, originally given on a Cartesian grid, in the Fourier-Bessel basis for 
the disk. Because the images are bandlimited in the Fourier domain, we use a sampling criterion to truncate 
the Fourier-Bessel expansion such that the maximum amount of information is preserved without the effect 
of aliasing. The constructed covariance matrix is invariant to rotation and reflection and has a special block 
diagonal structure. PCA is efficiently done for each block separately. This Fourier-Bessel based PCA detects 
more meaningful eigenimages and has improved denoising capability compared to traditional PCA for a finite 
number of noisy images. 



OCIS codes: 



100.0100, 100.3008, 180.0180 



1. Introduction 

Principal component analysis (PCA) is a classical 
method for dimensionality reduction, compression and 
denoising. The principal components are the eigenvec- 
tors of the sample covariance matrix. In image anal- 
ysis, the principal components are often referred to as 
"eigenimages" and they form a basis adaptive to the 
image set. In some applications, including all planar ro- 
tations of the input images for PCA is advantageous. 
For example, in single particle reconstruction (SPR) us- 
ing cryo-electron microscopy [1], the 3D structure of a 
molecule needs to be determined from many noisy 2D 
projection images taken at unknown viewing directions. 
PCA, known in this field as multivariate statistical anal- 
ysis (MSA) is often a first step in SPR [2]. Inclusion of 
the rotated images for PCA is desirable, because such 
images are just as likely to be obtained in the exper- 
iment, by in-plane rotating either the specimen or the 
detector. When all rotated images are included for PCA, 
then the eigenimages have a special separation of vari- 
ables form in polar coordinates in terms of radial func- 
tions and angular Fourier modes 0-0- It is easy to 
steer the eigenimages by a simple phase shift, hence the 
name "steerable PCA". Computing the steerable PCA 
efficiently and accurately is however challenging. 

The first challenge is to mitigate the increased com- 
putational cost associated with replicating each image 
multiple times. Efficient algorithms for steerable PCA 
were introduced in p, |7[ with computational complexity 
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almost similar to that of traditional PCA on the orig- 
inal images without their rotations. Current efficient 
algorithms for steerable PCA first map the images from 
Cartesian grid to polar grid, using, e.g., interpolation. 
Because the transformation from Cartesian to polar is 
not unitary, the eigenimages corresponding to images 
mapped to polar grid are not equivalent to transforming 
the original eigenimages from Cartesian to polar. 

The second challenge is associated with noise. The 
non-unitary transformation from Cartesian to polar 
changes the noise statistics. Interpolation transforms 
additive white noise to correlated (i.e., colored) noise. 
As a consequence, spurious eigenimages and eigenvalues 
corresponding to colored noise may arise in the eigen- 
analysis. The naive algorithm for steerable PCA that 
replicates each image multiple times also mistreats the 
noise: While the realization of noise is independent be- 
tween the original images, the realization of noise among 
duplicated images is dependent. This in turn can lead 
to noise-induced spurious eigenimages and to artifacts 
in the leading eigenimages. 

We present a new efficient and accurate algorithm for 
computing the steerable PCA that meets these chal- 
lenges. This is achieved by combining into the steer- 
able PCA framework a sampling criterion that was pro- 
posed by Klug and Crowther [8| in a different con- 
text of reconstructing a 2D image from its ID projec- 
tions. In order to avoid spurious information and noise 
amplification in the reconstruction, the images are ex- 
panded in Fourier-Bessel functions (alternatively, in pro- 
late spheroidal wave functions), while only keeping func- 
tions satisfying k-\-2q < k max + 2, where k index the an- 
gular frequencies and q index the radial frequencies (cf. 
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eqs. (dJ-©). k max is determined by the distribution of 
the radial lines. For example, if one has 2m uniformly 
spaced radial lines, k max = m — 1. This cut-off criterion 
is also useful for representing 2D images sampled on a 
Cartesian grid with Fourier-Bessel functions. 

2. Methods 

The images are spatially limited objects. The eigen- 
functions of the Laplacian in a disk of radius r max with 
vanishing Dirichlet boundary condition are the Fourier- 
Bessel functions. Hence, they form an orthogonal ba- 
sis to the space of squared-integrable functions over the 
disk, and it is natural to expand an object of radius 
r m ax in that basis. The functions we use to expand the 
spatially limited object are of the form 
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where R kq is the q th root of the equation 

Jk(2nRk q rmax) = 0, (2) 

and k and q satisfy k + 2q < k max + 2 [8] . The functions 
ip kq are normalized to unity, that is 
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A one-dimensional projection of ip kq contains k + 2q ze- 
ros asymptotically. The zeros are approximately equally 
spaced. Therefore, the functions ip kq for a given k + 2q 
can represent projections with the same resolution. The 
maximum resolution is k max = [7rr max - lj , where r max 
is the number of pixels along the radial line with = 0. 
The number of components satisfying k + 2q < k max is 
approximately ^fc^ a£C , which is smaller than the number 
of pixels inside the disk by only a factor of j. 

Given n images, 7i,...,/ n , the truncated Fourier- 
Bessel expansion of Ii is given by 

|^ kmax +2 — fc j ^ 

h{r,6)= J2 E 4> fe9 M). (4) 

q=l k= — k max 

If If denotes the rotation of image Ii by an angle a, 
then 

If (r, 9) = h{r,e - a) = £ < g e* te ^(r, 9). (5) 

k,q 

Because J-k(x) = (—l) k Jk(x), for real valued images 
a-k,q = (— l) fc a^. Also, under reflection a\ changes 
to a l _ k , and the reflected image, denoted 1J, is given 

by 

Il{r,9) = I(r,n- 9) = ^4, g J fe (r)e^" e ) (6) 

k,q 

= J2al q (-l) k J k (r)e- M . (7) 
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The sample covariance matrix built from the original 
images and N rotated (and reflected) copies is formally 
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To build C, instead of using the original images and their 
rotated copies, we use the expansion coefficients a\ : 
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The non-zero entries of C correspond to k = k! , ren- 
dering its block diagonal structure. Also, it suffices to 
consider k > 0, because C (M ) 5 ( M /) = C ( _ M)) (_ M /). 
Thus, the covariance matrix can be written as the direct 
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where C<*>, = 1 £? =1 *K,K,')*} is by itself a sam- 
pie covariance matrix of size pk = [ fe ^^+ 2 ~ fc j . We used 
the k -\-2q criterion to avoid aliasing or sampling at fre- 
quencies higher than Nyquist frequency. Moreover, if 
the images are corrupted by independent white (uncor- 
rected) noise, then each is also affected by indepen- 
dent white noise, because the Fourier-Bessel transform 
is unitary (up to grid discretization). 



Algorithm 1: Fourier-Bessel Steerable PC A 
(FBsPCA) 

Require: n pre- whitened images ii, . . . , I n . 
1: For each U compute a l kjq using the Fourier-Bessel 
transform. 

2: For each k = 0, 1, . . . , kmax compute the covariance 
matrix C (fe) . The size of the block p k = [ fcmax 2 +2 ~ fc j . 

3: For each k compute the eigenvalues of C^ k \ in 
descending order \\ > \l • • • > \ v k k and the 
associated eigenvectors h \ , h k , . . . , h p k k . 

4: The eigenimages V kl (0 < k < k ma x, 1 < / < Pk) are 
linear combinations of the Fourier-Bessel functions, 
i.e., V kl = tp k h l k , where ^ = bp k \ • • • , i/> kpk ]- 

5: (optional) Use [3] to estimate the number of 
components to choose for each C^ k \ 



It can be verified that the computational complexity 
of FBsPCA is 0(nr^ aa , + r^ax)) whereas the compu- 
tational complexity of the traditional PCA (applied on 
the original images without their rotational copies) is 
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3. Results 

In the experiments, we simulated 10 4 clean and noisy 
(corrupted by additive white Gaussian noise) projection 
images of E. coli 70S ribosome (Fig. [I]). The running 
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Fig. 1: Simulated 70S ribosome projection images with 
different signal to noise ratio. 

time of traditional PCA was 695 seconds, compared to 
85 seconds for FBsPCA, both implemented in MATLAB 
on a machine with 2 Intel(R) Xeon(R) CPUs X5570, 
each with 4 cores, running at 2.93 GHz. The top 5 
eigenimages for noisy images agree with the eigenim- 
ages from clean projection images (Fig. [2fa) and Fig. 
Efc)). The eigenimages generated by FBsPCA are much 
cleaner than the eigenimages from the traditional PCA 
(Fig. [2fc) and Fig. Efd)). We compared the eigenval- 
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Fig. 2: Eigenimages for 10 4 simulated 705 ribosome 
projection images, (a) FBsPCA and (b) traditional 

PCA from clean images, (c) FBsPCA and (d) 
traditional PCA from noisy images (SNR= -^). The 
eigenimages are ordered by the eigenvalues. Image size 
is 129 x 129 pixels and r max = 55 pixels. 



ues for the same ribosome projection images with dif- 
ferent signal to noise ratios (Fig. [3j). As the noise vari- 
ance (a 2 ) increases, the number of signal components 
that we are able to discriminate decreases. We use the 
method in [9| to estimate the noise variance and deter- 
mine number of components automatically. With the 
estimated noise variance a 2 , components with eigenval- 
ues X{ > (T 2 (l + + where j k = ^ for k = 
and 7fc = |^ for k > 0, are chosen. 

We compared the effects of denoising with FB- 
sPCA and traditional PCA. The asymptotically opti- 
mal Wiener filters [10] designed with the eigenvalues and 
eigenimages from FBsPCA (PCA, resp.) are applied to 
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Fig. 3: Eigenvalues for and C^ 10 ) for simulated 
projection images with different signal to noise ratios. 



the noisy images. The number of signal components 
determined with PCA is 62, whereas the number of sig- 
nal components for FBsPCA is 320 for 10 4 noisy images 
with SNR= and r max = 55 pixels. Fig. [4] shows that 
FBsPCA gives better denoised image. 



(a) 



(b) 



(c) 



(d) 



Fig. 4: Denoising by FBsPCA and PCA. n = 10 4 and 

^max = 55 pixels, (a) clean image, (b) noisy image 
(SNR= ^). (c) FBsPCA denoised. (d) PCA denoised. 



4. Summary 

In summary, we have combined a sampling criterion pro- 
posed in [8[ into the framework of steerable PCA. The 
unitary transform using Fourier-Bessel functions with 
the k + 2q cut-off criterion keeps the statistics of white 
noise unchanged. This sampling criterion obtains max- 
imum information from a set of images with its rotated 
and reflected copies while preventing the introduction 
of spurious detail and amplification the noise by effects 
which are beyond resolution. Instead of building the in- 
variant covariance matrix from the original images and 
their rotated and reflected copies, we compute the co- 
variance matrix from the expansion coefficients. It has 
a special block diagonal structure that allows us to per- 
form PCA on different angular frequencies separately. 
This greatly reduces the computational complexity. The 
steerable PCA projects data into subspaces with differ- 
ent angular Fourier modes. Since we reduce the dimen- 
sionality for PCA by projecting data into subspaces, we 
are able to retrieve more signal components from the 
noisy data and therefore achieve better denoising com- 
pared with traditional PCA. 
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